This crate provides the Repository abstraction which serves as a hub into all the functionality of git.
It’s powerful and won’t sacrifice performance while still increasing convenience compared to using the sub-crates individually. Sometimes it may hide complexity under the assumption that the performance difference doesn’t matter for all but a few tools, which are expected to use the underlying crates directly or file an issue.
§The Trust Model
It is very simple: trust is assigned based on the ownership of the repository compared to the user of the current process. This can be overridden as well. Further, git configuration files track their trust level per section based on their source, and sensitive values like paths to executables will be skipped if they come from a source that isn’t fully trusted.
That way, data can safely be obtained without the risk of running untrusted executables.
Note that it’s possible to let gix act like git or git2 by setting the open::Options::bail_if_untrusted() option.
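The trust model described above can be pictured with a small standard-library sketch. It is an illustration under stated assumptions, not gix's implementation: the `Trust` enum, `trust_from_ownership`, `ConfigValue`, `is_sensitive` and `lookup` are hypothetical names, and the list of sensitive keys is only an example.

```rust
/// Hypothetical trust levels, modelled after the description above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Trust {
    Reduced,
    Full,
}

/// Assign trust by comparing the repository owner with the user of the current process.
fn trust_from_ownership(repo_owner_uid: u32, process_uid: u32) -> Trust {
    if repo_owner_uid == process_uid {
        Trust::Full
    } else {
        Trust::Reduced
    }
}

/// A configuration value together with the trust level of the section it came from.
struct ConfigValue<'a> {
    key: &'a str,
    value: &'a str,
    section_trust: Trust,
}

/// Keys whose values may name executables (illustrative list only).
fn is_sensitive(key: &str) -> bool {
    matches!(key, "core.sshCommand" | "core.fsmonitor" | "credential.helper")
}

/// Return the last usable value for `key`, skipping sensitive values
/// that come from sections which aren't fully trusted.
fn lookup<'a>(values: &[ConfigValue<'a>], key: &str) -> Option<&'a str> {
    values
        .iter()
        .rev() // later values override earlier ones
        .filter(|v| v.key == key)
        .find(|v| !is_sensitive(v.key) || v.section_trust == Trust::Full)
        .map(|v| v.value)
}

fn main() {
    let repo_trust = trust_from_ownership(0, 1000);
    let values = [
        ConfigValue { key: "core.sshCommand", value: "/usr/bin/ssh", section_trust: Trust::Full },
        ConfigValue { key: "core.sshCommand", value: "./evil.sh", section_trust: repo_trust },
        ConfigValue { key: "user.name", value: "someone", section_trust: repo_trust },
    ];
    println!("repository trust: {:?}", repo_trust);
    println!("ssh command: {:?}", lookup(&values, "core.sshCommand"));
    println!("user name: {:?}", lookup(&values, "user.name"));
}
```

In this sketch the untrusted override of the executable path is skipped, while the harmless value from the same untrusted section is still returned.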
§The prelude and extensions
With use gix::prelude::* you should be ready to go as it pulls in various extension traits to make functionality available on objects that may use it.
The method signatures are still complex and may require various arguments for configuration and cache control.
Most extensions to existing objects provide an obj_with_extension.attach(&repo).an_easier_version_of_a_method() for simpler call signatures.
§ThreadSafe Mode
By default, the Repository isn’t Sync and thus can’t be used in certain contexts which require the Sync trait.
To help with this, convert it with .into_sync() into a ThreadSafeRepository.
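Why a thread-local handle is not Sync, and what a conversion like .into_sync() buys, can be sketched with the standard library alone. The types below (`Shared`, `LocalRepo`, `SyncRepo`) and the `to_thread_local` and `describe` methods are hypothetical stand-ins for this illustration and do not reproduce gix's actual types.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::sync::Arc;
use std::thread;

/// State that is safe to share between threads (a stand-in for shared repository state).
struct Shared {
    git_dir: String,
}

/// Thread-local handle: the `RefCell` cache makes this type `!Sync`.
struct LocalRepo {
    shared: Arc<Shared>,
    cache: RefCell<HashMap<u32, String>>,
}

/// `Sync` counterpart which only keeps the shareable state.
struct SyncRepo {
    shared: Arc<Shared>,
}

impl LocalRepo {
    fn open(git_dir: &str) -> Self {
        LocalRepo {
            shared: Arc::new(Shared { git_dir: git_dir.to_string() }),
            cache: RefCell::new(HashMap::new()),
        }
    }

    /// Drop the thread-local cache and keep what can be shared.
    fn into_sync(self) -> SyncRepo {
        SyncRepo { shared: self.shared }
    }

    /// Compute a description for `id`, memoized in the thread-local cache.
    fn describe(&self, id: u32) -> String {
        let mut cache = self.cache.borrow_mut();
        cache
            .entry(id)
            .or_insert_with(|| format!("{}#{}", self.shared.git_dir, id))
            .clone()
    }
}

impl SyncRepo {
    /// Create a fresh thread-local handle with its own empty cache.
    fn to_thread_local(&self) -> LocalRepo {
        LocalRepo {
            shared: Arc::clone(&self.shared),
            cache: RefCell::new(HashMap::new()),
        }
    }
}

/// Share one repository across threads, each using its own local handle.
fn describe_in_threads(repo: LocalRepo, ids: Vec<u32>) -> Vec<String> {
    let sync_repo = Arc::new(repo.into_sync());
    let handles: Vec<_> = ids
        .into_iter()
        .map(|id| {
            let sync_repo = Arc::clone(&sync_repo);
            thread::spawn(move || sync_repo.to_thread_local().describe(id))
        })
        .collect();
    handles
        .into_iter()
        .map(|handle| handle.join().expect("worker thread panicked"))
        .collect()
}

fn main() {
    let descriptions = describe_in_threads(LocalRepo::open(".git"), vec![1, 2, 3]);
    println!("{:?}", descriptions);
}
```

The design choice illustrated here is that per-thread caches need no locking, while the shared part stays behind an Arc and can be handed to any number of threads.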
§Object-Access Performance
Accessing objects quickly is the bread-and-butter of working with git, right after accessing references. Hence it’s vital to understand which cache levels exist and how to leverage them.
When accessing an object, the first cache that’s queried is a memory-capped LRU object cache, mapping object ids to their data and kind.
It has to be specifically enabled on a Repository.
On a miss, the object is looked up, and if it is found in a pack, there is a small fixed-size cache for delta-base objects.
In scenarios where the same objects are accessed multiple times, the object cache can be useful and is to be configured specifically using the object_cache_size(…) method.
Use the cache-efficiency-debug cargo feature to learn how efficient the cache actually is. It’s easy to end up with lowered performance if the cache is hit less than 50% of the time.
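To make the idea of a memory-capped LRU object cache and its hit rate concrete, here is a minimal standard-library sketch. It is not the cache gix uses: `ObjectCache`, `Kind`, the `u64` ids and the counters are assumptions made for illustration, and eviction is kept deliberately simple.

```rust
use std::collections::{HashMap, VecDeque};

/// Object kinds, as a stand-in for what a real object database distinguishes.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind {
    Blob,
    Tree,
    Commit,
    Tag,
}

/// A memory-capped LRU cache mapping an object id to its kind and data.
struct ObjectCache {
    cap_bytes: usize,
    used_bytes: usize,
    entries: HashMap<u64, (Kind, Vec<u8>)>,
    /// Ids ordered from least to most recently used.
    order: VecDeque<u64>,
    hits: u64,
    misses: u64,
}

impl ObjectCache {
    fn new(cap_bytes: usize) -> Self {
        ObjectCache {
            cap_bytes,
            used_bytes: 0,
            entries: HashMap::new(),
            order: VecDeque::new(),
            hits: 0,
            misses: 0,
        }
    }

    /// Look up an object, counting hits and misses and refreshing its recency.
    fn get(&mut self, id: u64) -> Option<(Kind, &[u8])> {
        if self.entries.contains_key(&id) {
            self.hits += 1;
            self.order.retain(|&other| other != id);
            self.order.push_back(id);
            self.entries.get(&id).map(|(kind, data)| (*kind, data.as_slice()))
        } else {
            self.misses += 1;
            None
        }
    }

    /// Insert an object, evicting least recently used entries until it fits.
    fn put(&mut self, id: u64, kind: Kind, data: Vec<u8>) {
        if data.len() > self.cap_bytes {
            return; // larger than the whole cache, don't store it
        }
        if let Some((_, old)) = self.entries.remove(&id) {
            self.used_bytes -= old.len();
            self.order.retain(|&other| other != id);
        }
        while self.used_bytes + data.len() > self.cap_bytes {
            match self.order.pop_front() {
                Some(lru) => {
                    if let Some((_, evicted)) = self.entries.remove(&lru) {
                        self.used_bytes -= evicted.len();
                    }
                }
                None => break,
            }
        }
        self.used_bytes += data.len();
        self.entries.insert(id, (kind, data));
        self.order.push_back(id);
    }

    /// Fraction of lookups answered from the cache.
    fn hit_rate(&self) -> f64 {
        let total = self.hits + self.misses;
        if total == 0 {
            0.0
        } else {
            self.hits as f64 / total as f64
        }
    }
}

fn main() {
    let mut cache = ObjectCache::new(8);
    cache.put(1, Kind::Blob, vec![0; 4]);
    cache.put(2, Kind::Commit, vec![0; 4]);
    let _ = cache.get(1); // hit, makes 1 the most recently used entry
    cache.put(3, Kind::Blob, vec![0; 4]); // evicts 2
    let _ = cache.get(2); // miss
    println!("hit rate: {:.2}", cache.hit_rate());
}
```

Tracking the hit rate like this is what makes it possible to judge whether a cache of a given size pays for its bookkeeping.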
§Terminology
§WorkingTree and WorkTree
When reading the documentation of the canonical git-worktree program one gets the impression work tree and working tree are used interchangeably. We use the term work tree only and try to do so consistently as it’s shorter and assumed to be the same.
§Plumbing Crates
To make using sub-crates and their types easier, these are re-exported into the root of this crate. Here we list how to access nested plumbing crates which are otherwise harder to discover:
git_repository::
§libgit2 API to gix
Doc-aliases are used to help find methods under a possibly changed name. Just search in the docs.
Entering git2 into the search field will also surface all methods with such annotations.
What follows is a list of methods you might be missing, along with workarounds if available.
git2::Repository::open_bare() ➡ ❌ - use open() and discard if it is not bare.
git2::build::CheckoutBuilder::disable_filters() ➡ ❌ (filters are always applied during checkouts)
git2::Repository::submodule_status() ➡ Submodule::state() - status provides more information and conveniences though, and an actual worktree status isn’t performed.
§Integrity checks
git2 by default performs integrity checks via strict_hash_verification() and strict_object_creation which gitoxide currently does not have.
§Feature Flags
There are various categories of features which help to optimize performance and build times. gix comes with ‘batteries included’ and everything is enabled as long as it doesn’t sacrifice compatibility. Most users will be fine with that but will pay with higher compile times than necessary as they probably don’t use all of these features.
Thus it’s recommended to take a moment and optimize build times by choosing only those ‘Components’ that you require. ‘Performance’ relevant features should be chosen next to maximize efficiency.
§Application Developers
These are considered the end-users. All they need to tune is the Performance features to optimize the efficiency of their app, assuming they don’t use gix directly. Otherwise, see the Library Developers paragraph.
In order to configure a crate that isn’t a direct dependency, one has to make it a direct dependency. We recommend gix-for-configuration = { package = "gix", version = "X.Y.Z", features = […] } to make clear this dependency isn’t used in code.
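As a sketch of the recommendation above, an application manifest might look as follows. The dependency some-git-tool-library is hypothetical, the version stays a placeholder, and max-performance is only one possible choice from the Performance features listed further down.

```toml
[dependencies]
# Hypothetical library that uses gix internally.
some-git-tool-library = "1"

# Not used in code: only present to select gix performance features.
gix-for-configuration = { package = "gix", version = "X.Y.Z", features = ["max-performance"] }
```

Because cargo unifies features across the dependency graph, the features selected here also apply to the gix used by the library.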
§Library Developers
As a developer of a library, you should start out with gix = { version = "X.Y.Z", default-features = false } and add components as you see fit.
For best compatibility, do not activate max-performance-safe or any other performance options.
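For illustration, a library manifest following this advice could look like the fragment below. The two components are picked arbitrarily from the Components list and the version remains a placeholder.

```toml
[dependencies]
# Start without default features and add only the components in use
# (revspec parsing and blob diffs here, chosen purely as an example).
# No performance features are activated, leaving that choice to the application.
gix = { version = "X.Y.Z", default-features = false, features = ["revision", "blob-diff"] }
```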
§Bundles
A bundle is a set of related feature toggles which can be activated with a single name that acts as a group. Bundles are for convenience only and bear no further meaning beyond the cargo manifest file.
basic (enabled by default) — More fundamental components that most will be able to make good use of.
extras (enabled by default) — Various additional features and capabilities that are not necessarily part of what most users would need.
comfort (enabled by default) — Various progress-related features that improve the look of progress message units.
§Components
A component is a distinct feature which may be comprised of one or more methods around a particular topic. Providers of libraries should only activate the components they need.
- command — Provide a top-level command module that helps with spawning commands similarly to git.
- status — Obtain information similar to git status.
- interrupt — Utilities for interrupting computations and cleaning up tempfiles.
- index — Access to .git/index files.
- dirwalk — Support directory walks with Git-style annotations.
- credentials — Access to credential helpers, which provide credentials for URLs.
- worktree-mutation — Various ways to alter the worktree makeup by checkout and reset.
- excludes — Retrieve a worktree stack for querying exclude information.
- attributes — Query attributes and excludes. Enables access to pathspecs, worktree checkouts, filter-pipelines and submodules.
- mailmap — Add support for mailmaps, as a way of determining the final name of committers and authors.
- revision — Make revspec parsing possible, as well as describing revisions.
- revparse-regex — If enabled, revspecs support the regex syntax like @^{/^.*x}. Otherwise, only substring search is supported. This feature does increase compile time for niche benefit, but is required for fully git-compatible revspec parsing.
- blob-diff — Make it possible to diff blobs line by line. Note that this feature is integral for implementing tree-diffs as well due to the handling of rename-tracking, which relies on line-by-line diffs in some cases.
- worktree-stream — Make it possible to turn a tree into a stream of bytes, which can be decoded to entries and turned into various other formats.
- worktree-archive — Create archives from a tree in the repository, similar to what git archive does.
  Note that we disable all default features which strips it of all container support, like tar and zip. Your application should add it as dependency and re-activate the desired features.
§Mutually Exclusive Network Client
Either the async-* or the blocking-* versions of these toggles may be enabled at a time, but not both.
For this reason, these must be chosen by the user of the library and can’t be pre-selected.
Making a choice here also affects which crypto-library ends up being used.
async-network-client — Make gix-protocol available along with an async client.
async-network-client-async-std — Use this if your crate uses async-std as runtime, and enable basic runtime integration when connecting to remote servers via the git:// protocol.
blocking-network-client — Make gix-protocol available along with a blocking client, providing access to the file://, git:// and ssh:// transports.
blocking-http-transport-curl — Stacks with blocking-network-client to provide support for HTTP/S using curl, and implies blocking networking as a whole, making the https:// transport available.
blocking-http-transport-curl-rustls — Stacks with blocking-http-transport-curl and also enables the rustls backend to avoid openssl.
blocking-http-transport-reqwest — Stacks with blocking-network-client to provide support for HTTP/S using reqwest, and implies blocking networking as a whole, making the https:// transport available.
blocking-http-transport-reqwest-rust-tls — Stacks with blocking-http-transport-reqwest and enables https:// via the rustls crate.
blocking-http-transport-reqwest-rust-tls-trust-dns — Stacks with blocking-http-transport-reqwest and enables https:// via the rustls crate. This also makes use of trust-dns to avoid getaddrinfo, but note it comes with its own problems.
blocking-http-transport-reqwest-native-tls — Stacks with blocking-http-transport-reqwest and enables https:// via the native-tls crate.
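As an example of making this choice, the manifest fragment below selects blocking networking with HTTP/S via reqwest and rustls. Listing the stacked toggles explicitly is a conservative assumption of this sketch, since some of them may already imply the others. The version remains a placeholder.

```toml
[dependencies]
# Blocking client plus HTTP/S via reqwest and rustls.
# No async-* toggle is enabled alongside these.
gix = { version = "X.Y.Z", features = [
    "blocking-network-client",
    "blocking-http-transport-reqwest",
    "blocking-http-transport-reqwest-rust-tls",
] }
```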
§Performance
These features exist to allow optimizing for compile time while optimizing for compatibility by default. This means that some performance options around SHA1 and ZIP might not compile on all platforms, so it is up to the end-user who compiles the application to choose these based on their needs.
- max-control — Activate features that maximize performance, like using threads, but leave everything else that might affect compatibility out to allow users more fine-grained control over performance features like which zlib* implementation to use. No C toolchain is involved.
- max-performance-safe (enabled by default) — Activate features that maximize performance, like usage of threads and access to caching in object databases, skipping the ones known to cause compile failures on some platforms. Note that this configuration still uses a pure Rust zlib implementation which isn’t the fastest compared to its C-alternatives. No C toolchain is involved.
- parallel-walkdir — If set, walkdir iterators will be multi-threaded which affects the listing of loose objects and references. Note, however, that this will use rayon under the hood and spawn threads for each traversal to avoid a global rayon thread pool. Thus this option is more interesting to one-off client applications, rather than the server.
- hp-tempfile-registry — The tempfile registry uses a better implementation of a thread-safe hashmap, relying on an external crate. This may be useful when tempfiles are created and accessed in a massively parallel fashion and you know that this is indeed faster than the simpler implementation that is the default.
- parallel — Make certain data structures threadsafe (or Sync) to facilitate multithreading. Further, many algorithms will now use multiple threads by default.
  If unset, most of gix can only be used in a single thread as data structures won’t be Send anymore.
- pack-cache-lru-static — Provide a fixed-size allocation-free LRU cache for packs. It’s useful if caching is desired while keeping the memory footprint for the LRU-cache itself low.
- pack-cache-lru-dynamic — Provide a hash-map based LRU cache whose eviction is based on a memory cap calculated from object data.
- max-performance — Activate other features that maximize performance, like usage of threads, zlib-ng and access to caching in object databases. Note that some platforms might suffer from compile failures, which is when max-performance-safe should be used.
- fast-sha1 — If enabled, use assembly versions of sha1 on supported platforms. This might cause compile failures as well which is why it can be turned off separately.
- zlib-ng — Use the C-based zlib-ng backend, which can compress and decompress significantly faster. Note that this will cause duplicate symbol errors if the application also depends on zlib - use zlib-ng-compat in that case.
- zlib-ng-compat — Use zlib-ng via its zlib-compat API. Useful if you already need zlib for C code elsewhere in your dependencies. Otherwise, use zlib-ng.
- zlib-stock — Use a slower C-based backend which can still compress and decompress significantly faster than the Rust version. Unlike zlib-ng-compat, this allows using dynamic linking with system zlib libraries and doesn’t require cmake.
§Other
The catch-all of feature toggles.
- tracing — Enable tracing using the tracing crate for coarse tracing.
- tracing-detail — Enable tracing using the tracing crate for detailed tracing. Also enables coarse tracing.
- verbose-object-parsing-errors — When parsing objects, by default errors will only be available at the granularity of success or failure. With this flag enabled, detailed information about the error location will be collected. Use it in applications which expect broken or invalid objects or for debugging purposes. Incorrectly formatted objects aren’t very common otherwise.
- serde — Data structures implement serde::Serialize and serde::Deserialize.
- progress-tree — Re-export the progress tree root which allows obtaining progress from various functions which take impl gix::Progress. Applications which want to display progress will probably need this implementation.
- cache-efficiency-debug — Print debugging information about usage of object database caches, useful for tuning cache sizes.
- regex — For use in rev-parse, which provides searching commits by running a regex on their message.
  If disabled, the text will be searched verbatim in any portion of the commit message, similar to how a simple unanchored regex of only ‘normal’ characters would work.
Re-exports§
pub use gix_actor as actor;
pub use gix_attributes as attrs;
attributes
pub use gix_command as command;
command
pub use gix_commitgraph as commitgraph;
pub use gix_credentials as credentials;
credentials
pub use gix_date as date;
pub use gix_dir as dir;
dirwalk
pub use gix_features as features;
pub use gix_fs as fs;
pub use gix_glob as glob;
pub use gix_hash as hash;
pub use gix_hashtable as hashtable;
pub use gix_ignore as ignore;
excludes
pub use gix_lock as lock;
pub use gix_negotiate as negotiate;
credentials
pub use gix_object as objs;
pub use gix_object::bstr;
pub use gix_odb as odb;
pub use gix_prompt as prompt;
credentials
pub use gix_protocol as protocol;
gix-protocol
pub use gix_ref as refs;
pub use gix_refspec as refspec;
pub use gix_revwalk as revwalk;
pub use gix_sec as sec;
pub use gix_tempfile as tempfile;
pub use gix_trace as trace;
pub use gix_traverse as traverse;
pub use gix_url as url;
pub use gix_utils as utils;
pub use gix_validate as validate;
Modules§
- dirwalk
dirwalk
- Utilities to handle program arguments and other values of interest.
- filter
attributes
lower-level access to filters which are applied to create working tree checkouts or to ‘clean’ working tree contents for storage in git. - index
index
Feature Flags - Process-global interrupt handling
- mailmap
mailmap
- Run computations in parallel, or not based the
parallel
feature toggle. - pathspec
attributes
Pathspec plumbing and abstractions - Revisions is the generalized notion of a commit.
- Not to be confused with ‘status’.
- status
status
- submodule
attributes
Submodule plumbing and abstractions - Type definitions for putting shared ownership and synchronized mutation behind the
threading
feature toggle.
Structs§
- Attribute
Stack excludes
orattributes
A utility to access.gitattributes
and.gitignore
information efficiently. - A blob along with access to its owning repository.
- A decoded commit object with access to its owning repository.
- The head reference, as created from looking at
.git/HEAD
, able to represent all of its possible states. - An
ObjectId
with access to a repository. - A decoded object with a reference to its owning repository.
- A detached, self-contained object, without access to its source repository.
- Pathspec
attributes
A utility to make matching against pathspecs simple. - Pathspec
Detached attributes
LikePathspec
, but without a Repository reference and with minimal API. - A reference that points to an object or reference, with access to its source repository.
- A remote which represents a way to interact with hosts for remote clones of the parent repository.
- A thread-local handle to interact with a repository from a single thread.
- Submodule
attributes
A stand-in for the submodule of a particular name. - A decoded tag object with access to its owning repository.
- An instance with access to everything a git repository entails, best imagined as container implementing Sync + Send for most for system resources required to interact with a git repository which are loaded in once the instance is created.
- A decoded tree object with access to its owning repository.
- A URL with support for specialized git related capabilities.
- A worktree checkout containing the files of the repository in consumable form.
- A borrowed reference to a hash identifying objects.
Enums§
- An owned hash identifying objects, most commonly
Sha1
Traits§
- A thread-safe read-only counter, with unknown limits.
- An object-safe trait for describing hierarchical progress.
- A trait for describing hierarchical progress.
- A trait for describing non-hierarchical progress.
Functions§
- See ThreadSafeRepository::discover(), but returns a Repository instead.
- See ThreadSafeRepository::init(), but returns a Repository instead.
- See ThreadSafeRepository::init(), but returns a Repository instead.
- See ThreadSafeRepository::open(), but returns a Repository instead.
- See ThreadSafeRepository::open_opts(), but returns a Repository instead.
- Create a platform for configuring a clone with main working tree from url to the local path, using default options for opening it (but amended with using configuration from the git installation to ensure all authentication options are honored).
- Create a platform for configuring a bare clone from url to the local path, using default options for opening it (but amended with using configuration from the git installation to ensure all authentication options are honored).
Type Aliases§
- A handle for finding objects in an object database, abstracting away caches for thread-local use.
- The standard type for a store to handle git references.